ABSTRACT
Twitter has become one of the most popular microblogging sites for people to broadcast (or "tweet") their thoughts to the world in 140 characters or less. Since these messages are available for public consumption, one might expect tweets not to contain private or incriminating information. Nevertheless, we observe a large number of users who unwittingly post sensitive information about themselves and about other people, with potentially negative consequences for those involved. While some awareness exists of such privacy issues on social networks such as Twitter and Facebook, there has been no quantitative, scientific study addressing this problem.
In this paper we make three major contributions. First, we characterize the nature of privacy leaks on Twitter to understand what types of private information people reveal there. We specifically analyze three types of leaks: divulging vacation plans, tweeting under the influence of alcohol, and revealing medical conditions. Second, using this characterization, we build automatic classifiers that detect incriminating tweets on these three topics in real time, in order to demonstrate the real threat posed to users by, e.g., burglars and law enforcement. Third, we characterize who leaks information and how. We study both self-incriminating primary leaks and secondary leaks that reveal sensitive information about others, as well as the prevalence of leaks in status updates and conversation tweets. We also conduct a cross-cultural study to investigate the prevalence of leaks in tweets originating from the United States, the United Kingdom, and Singapore. Finally, we discuss how our classification system can be used as a defense mechanism to alert users to potential privacy leaks.
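To make the idea of topic-specific leak detection concrete, the following is a minimal sketch of a bag-of-words classifier for the three leak types named above. The abstract does not state which learning method or features were used, so the choice of multinomial Naive Bayes with Laplace smoothing is an assumption made purely for illustration. The training tweets, the label names, and the `NaiveBayes` class are all hypothetical and are not data or code from the paper.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-written examples (NOT data from the paper).
TRAIN = [
    ("leaving for hawaii tomorrow, gone for two weeks", "vacation"),
    ("off to paris on friday, house will be empty", "vacation"),
    ("so drunk right now lol", "drunk"),
    ("had way too many beers tonight", "drunk"),
    ("just got diagnosed with diabetes", "medical"),
    ("my doctor put me on antidepressants", "medical"),
    ("watching the game with friends", "other"),
    ("new blog post about python is up", "other"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of tweets
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored for every class.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes().fit(TRAIN)
for tweet in ["so drunk after all those beers",
              "diagnosed with diabetes last week",
              "off to hawaii tomorrow"]:
    print(tweet, "->", model.predict(tweet))
```

A real-time detector along these lines would apply `predict` to each incoming tweet and flag any result other than `"other"`. A realistic system would need far more labeled data and richer features than this toy example.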
Index Terms
- Loose tweets: an analysis of privacy leaks on twitter